A method of controlling congestion in a virtual circuit packet
network. An initial packet buffer is assigned to each virtual circuit
at each node, into which incoming packets are stored and later removed
for forward routing. If a larger buffer is desired for a virtual
circuit to service a larger amount of data, then additional buffer
space is dynamically allocated selectively to the virtual circuit on
demand if each node has sufficient unallocated buffer space to fill
the request. In one embodiment, the criterion for dynamic allocation
is based on the amount of data buffered at the data source. In
alternative embodiments, the criteria for dynamic allocation may be
further based on the amount of data buffered at each node for a
virtual circuit and the total amount of free buffer space at each node
of a virtual circuit. Signaling protocols are disclosed whereby data
sources and virtual circuit nodes maintain consistent information
describing the buffer allocations at all times.

METHOD AND APPARATUS FOR CONGESTION CONTROL IN A DATA NETWORK



Technical Field 

The present invention relates to data networks in general and more
particularly to protocols, methods and apparatus that improve the flow
of information within such networks.

Background of the Invention 

Packet-switched networks for the transport of digital data are well
known in the prior art. Typically, data are transmitted from a host
connecting to a network through a series of network links and switches
to a receiving host. Messages from the transmitting host are divided
into packets that are transmitted through the network and reassembled
at the receiving host. In virtual circuit networks, which are the
subject of the present invention, all data packets transmitted during
a single session between two hosts follow the same physical network
path.

Owing to the random nature of data traffic, data may arrive at a
switching node of the network at an instantaneous rate greater than
the transmission speed of the outgoing link, and data from some
virtual circuits may have to be buffered until they can be
transmitted. Various queueing disciplines are known in the prior art.
Early data networks typically used some form of first-in-first-out
(FIFO) queueing service. In FIFO service, data packets arriving from
different virtual circuits are put into a single buffer and
transmitted over the output link in the same order in which they
arrived at the buffer. More recently, some data networks have used
queueing disciplines of round robin type. Such a network is described
in a paper by A.G. Fraser entitled "TOWARDS A UNIVERSAL DATA TRANSPORT
SYSTEM," printed in the IEEE Journal on Selected Areas in
Communications, November 1983. Round robin service involves keeping
the arriving data on each virtual circuit in a separate per-circuit
buffer and transmitting a small amount of data in turn from each
buffer that contains any data, until all the buffers are empty. U.S.
Pat. No. 4,583,219 to Riddle describes a particular round robin
embodiment that gives low delay to messages consisting of a small
amount of data. Many other variations also fall within the spirit of
round robin service.

First-in-first-out queueing disciplines are somewhat easier to
implement than round robin disciplines. However, under heavy-traffic
conditions first-in-first-out disciplines can be unfair. This is
explained in a paper by S.P. Morgan entitled "QUEUEING DISCIPLINES AND
PASSIVE CONGESTION CONTROL IN BYTE-STREAM NETWORKS," printed in the
Proceedings of IEEE INFOCOM, April 1989. When many users are
contending for limited transmission resources, first-in-first-out
queueing gives essentially all of the bandwidth of congested links to
users who submit long messages, to the exclusion of users who are
attempting to transmit short messages. When there is not enough
bandwidth to go around, round robin disciplines divide the available
bandwidth equally among all users, so that light users are not locked
out by heavy users.
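
The difference between the two disciplines can be illustrated with a
small sketch. It is not taken from the cited papers; the function
names and the seven-cell arrival pattern are assumptions chosen only
to show a light user waiting behind a heavy user under FIFO service
and being served promptly under round robin service.

```python
from collections import deque

def fifo_order(arrivals):
    """Serve cells strictly in arrival order (one shared buffer)."""
    return list(arrivals)

def round_robin_order(arrivals):
    """Keep one queue per circuit; send one cell from each non-empty queue in turn."""
    queues = {}
    for vc in arrivals:
        queues.setdefault(vc, deque()).append(vc)
    order = []
    while any(queues.values()):
        for vc in list(queues):
            if queues[vc]:
                order.append(queues[vc].popleft())
    return order

# Hypothetical load: heavy user "A" submits six cells just before
# light user "B" submits one.
arrivals = ["A"] * 6 + ["B"]
fifo = fifo_order(arrivals)
rr = round_robin_order(arrivals)
# Position (0-based) at which B's single cell is transmitted.
print(fifo.index("B"), rr.index("B"))  # prints "6 1"
```

Under these assumptions the light user's cell is sent last with FIFO
service and second with round robin service.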

On any data connection it is necessary to keep the transmitter from
overrunning the receiver. This is commonly done by means of a
sliding-window protocol, as described by A.S. Tanenbaum in the book
COMPUTER NETWORKS, 2nd ed., published by Prentice Hall (1988), pp.
223-239. The transmitter sends data in units called frames, each of
which carries a sequence number. When the receiver has received a
frame, it returns the sequence number to the transmitter. The
transmitter is permitted to have only a limited number of sequence
numbers outstanding at once; that is, it may transmit up to a
specified amount of data and then it must wait until it receives the
appropriate sequential acknowledgment before transmitting any new
data. If an expected acknowledgment does not arrive within a specified
time interval, the transmitter retransmits one or more frames. The
maximum number of bits that the transmitter is allowed to have in
transit at any given time is called the window size and will be
denoted here by W. The maximum number of outstanding sequence numbers
is also sometimes called the window size, but that usage will not be
followed here.
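
The window rule just described can be sketched as a simple accounting
object. This is a minimal illustration, not a full sliding-window
protocol: sequence numbers, timers and retransmission are omitted, and
the class and method names are assumptions introduced here.

```python
class WindowedSender:
    """Tracks bits in transit against a window of W bits (names are illustrative)."""

    def __init__(self, window_bits):
        self.window_bits = window_bits  # W, the window size in bits
        self.in_transit = 0             # bits sent but not yet acknowledged

    def send(self, frame_bits):
        """Send a frame only if it fits in the remaining window."""
        if self.in_transit + frame_bits > self.window_bits:
            return False                # must wait for an acknowledgment
        self.in_transit += frame_bits
        return True

    def acknowledge(self, frame_bits):
        """An acknowledgment frees window space for new data."""
        self.in_transit = max(0, self.in_transit - frame_bits)

sender = WindowedSender(window_bits=3000)
sender.send(1000)
sender.send(2000)
blocked = not sender.send(1)    # window is full, so this frame is refused
sender.acknowledge(1000)
resumed = sender.send(1000)     # space was freed by the acknowledgment
```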

Suppose that the transmitter and receiver are connected by a circuit
of speed S bits per second with a round-trip propagation time of T_0
seconds, and that they are able to generate or absorb data at a rate
not less than S. Let W be the window size. Then, to maintain
continuous transmission on an otherwise idle path, W must be at least
as large as the round-trip window W_0, where W_0 is given by
W_0 = S T_0. W_0 is sometimes called the delay-bandwidth product. If
the circuit passes through a number of links whose speeds are
different, then S represents the speed of the slowest link. If the
window is less than the round-trip window, then the average fraction
of the network bandwidth that the circuit gets cannot exceed W/W_0.
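
As a worked illustration of the bound W/W_0, the following sketch
computes the round-trip window and the largest fraction of the
bandwidth a circuit can obtain. The function names are assumptions,
the round-trip time is passed in milliseconds to keep the arithmetic
exact, and the example speed and delay are illustrative values.

```python
def round_trip_window(speed_bps, round_trip_ms):
    """W_0 = S * T_0, in bits."""
    return speed_bps * round_trip_ms / 1000

def max_bandwidth_fraction(window_bits, speed_bps, round_trip_ms):
    """Upper bound on the fraction of bandwidth: min(1, W / W_0)."""
    w0 = round_trip_window(speed_bps, round_trip_ms)
    return min(1.0, window_bits / w0)

# Illustrative circuit: 1.5 Mbit/s with a 60 ms round trip.
w0 = round_trip_window(1_500_000, 60)                         # 90000.0 bits
half = max_bandwidth_fraction(45_000, 1_500_000, 60)          # 0.5
full = max_bandwidth_fraction(200_000, 1_500_000, 60)         # 1.0
```

A window of half the round-trip window therefore limits the circuit
to at most half of the link speed, while any window of W_0 or more
allows continuous transmission.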

In principle, if a circuit has a window of a given size, buffer space
adequate to store the entire window must be available at every
queueing point to prevent packet loss in all cases, since forward
progress can momentarily come to a halt at the beginning of any link.
This is explained in more detail below. On a lightly loaded network,
significant delays are unlikely and there can generally be sharing of
buffer space between circuits. However, the situation is different
when the network is congested. Congestion means that too much traffic
has entered the network, even though individual circuits may all be
flow controlled. Uncontrolled congestion can lead to data loss due to
buffer overflow, or to long delays that the sender interprets as
losses. The losses trigger retransmissions, which lead to an unstable
situation in which network throughput declines as offered load
increases. Congestion instability comes about because whenever data
has to be retransmitted, the fraction of the network's capacity that
was used to transmit the original data has been lost. In extreme
cases, a congested network can deadlock and have to be restarted.

Congestion control methods are surveyed by Tanenbaum, op. cit., pp.
287-88 and 309-320. Many congestion control methods involve the
statistical sharing of buffer space in conjunction with trying to
sense the onset of network congestion. When the onset of congestion is
detected, attempts are made to request or require hosts to slow down
their input of data into the network. These techniques are
particularly the ones that are subject to congestion instability.
Abusive hosts may continue to submit data and cause buffer overflow.
Buffer overflow causes packet losses not only of a host submitting the
packets that cause the overflow, but also of other hosts. Such packet
loss then gives rise to retransmission requests from all users losing
packets, and it is this effect that pushes the network toward
instability and deadlock. Alternatively, as mentioned above, it has
been recognised for a long time that congestion instability due to
data loss does not occur in a virtual-circuit network, provided that a
full window of memory is allocated to each virtual circuit at each
queueing node, and provided that if a sender times out it does not
retransmit automatically but first issues an inquiry message to
determine the last frame correctly received. If full per-circuit
buffer allocation is combined with an intrinsically fair queueing
discipline, that is, some variant of round robin, the network is
stable and as fair as it can be under the given load.

The DATAKIT (Registered trademark) network is a virtual circuit
network marketed by AT&T that operates at a relatively low
transmission rate and provides full window buffering for every virtual
circuit as just described. This network uses technology similar to
that disclosed in U.S. Patent Re. 31,319, which reissued on July 10,
1983 from A.G. Fraser's U.S. Patent No. 3,740,345 of July 31, 1973,
and operates over relatively low-speed T1 channels at approximately
1.5 megabits per second. The DATAKIT network is not subject to network
instability because of full-window buffering for each virtual circuit
and because data loss of one host does not cause data loss of other
users. Dedicated full-window buffering is reasonable for such
low-speed channels; however, the size of a data window increases
dramatically at speeds higher than 1.5 megabits per second, such as
might be used in fiber-optic transmission. If N denotes the maximum
number of simultaneously active virtual circuits at a node, the total
buffer space that is required to provide a round-trip window for each
circuit is N S T_0. It may be practicable to supply this amount of
memory at each node of a low-speed network of limited geographical
extent. However, at higher speeds and network sizes, it ultimately
ceases to be feasible to dedicate a full round-trip window of memory
for every virtual circuit. For example, assuming a nominal
transcontinental packet round-trip propagation time of 60 ms, a buffer
memory of 11 kilobytes is required for every circuit at every
switching node for a 1.5 megabits per second transmission rate. This
increases to 338 kilobytes at a 45 megabits per second rate.
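
The figures in this example follow directly from the product S T_0,
as the short calculation below suggests. It assumes one kilobyte is
1000 bytes; the function names and the circuit count of 1000 are
hypothetical values used only to show how N S T_0 scales.

```python
def full_window_bytes(speed_bps, round_trip_ms):
    """Per-circuit round-trip window S * T_0, converted from bits to bytes."""
    return speed_bps * round_trip_ms / 1000 / 8

def node_buffer_bytes(n_circuits, speed_bps, round_trip_ms):
    """Total memory N * S * T_0 needed to give every circuit a full window."""
    return n_circuits * full_window_bytes(speed_bps, round_trip_ms)

low_speed = full_window_bytes(1_500_000, 60)     # 11250.0 bytes, about 11 kilobytes
high_speed = full_window_bytes(45_000_000, 60)   # 337500.0 bytes, about 338 kilobytes

# Hypothetical node carrying 1000 simultaneously active circuits at 45 Mbit/s.
total = node_buffer_bytes(1000, 45_000_000, 60)  # 337500000.0 bytes
```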

A need exists for solutions to the problem of avoiding congestion
instability, while at the same time avoiding the burgeoning buffer
memory requirements of known techniques. It is therefore an overall
object of the present invention to retain the advantages of
full-window buffering while substantially reducing the total amount of
memory required.

It is another object of the invention to reduce the amount of
buffering required for each circuit by the sharing of buffer memory
between circuits and by dynamic adjustment of window sizes for
circuits.

U.S. Pat. No. 4,736,369 to Barzilai et al. addresses some aspects of
the problem of adjusting window sizes dynamically during the course of
a user session, in response to changes in traffic patterns and buffer
availability. However, this patent assumes a network in which flow
control and window adjustment are done on a link-by-link basis, that
is, as a result of separate negotiations between every pair of
adjacent nodes on the path between transmitter and receiver. For
high-speed networks, link-by-link flow control is generally considered
to be less suitable than end-to-end control, because of the additional
computing load that link-by-link control puts on the network nodes.

Thus, it is another object of the invention to perform flow control on
an end-to-end basis with dynamically adjustable windows.

Summary of the Invention 

The invention is a method of controlling congestion in a virtual
circuit data network. A data buffer is assigned to each virtual
circuit at each node, into which incoming data is stored and later
removed for forward routing. The size of a buffer for each virtual
circuit at a switching node is dynamically allocated in response to
signals requesting increased and decreased data window sizes,
respectively. If a larger buffer is desired for a virtual circuit to
service a larger amount of data, then additional buffer space is
dynamically allocated selectively to the virtual circuit on demand if
each node has sufficient unallocated buffer space to fill the request.
Conversely, the allocated buffer space for a circuit is dynamically
reduced when the data source no longer requires a larger buffer size.
In one embodiment, the additional space is allocated to a virtual
circuit in one or more blocks of fixed size, up to a maximum of a full
data window, wherein a full data window is defined as the virtual
circuit transmission rate multiplied by a representation of the
network round-trip propagation delay. In a second embodiment, the
additional allocation is done in blocks of variable size.

The size of a block to be allocated at each 
node of a virtual circuit is determined based on the 
amount of data waiting to be sent at the packet 
source, and on the amount of unallocated buffer 
space at each said node. It may also be based on 
the amount of data already buffered at each said 
node. 

To perform the additional allocation at each node of a virtual
circuit, in a representative embodiment of the invention, a first
control message is transmitted along a virtual circuit from the first
node in the circuit to the last node in the circuit. Each node writes
information into the first control message as it passes through,
describing the amount of unallocated buffer space at the node and the
amount of data already buffered at the node. The last node in the
virtual circuit returns the first control message to the first node,
where the size of an allocated block is determined based on the
information in the returned first control message. A second control
message is then transmitted from the first node to the last node in
the virtual circuit.
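
The first pass of this exchange can be sketched as follows. The text
above states only that the block size is determined from the
information gathered by the first control message; the specific rule
used here (grant no more than the node with the least unallocated
space can spare) is an assumption made for illustration, as are all
class and function names.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    unallocated_cells: int   # free buffer space at this node
    buffered_cells: int      # data already buffered for the circuit

@dataclass
class ControlMessage:
    reports: list = field(default_factory=list)

def collect(path):
    """First control message: each node appends its state as the message passes."""
    msg = ControlMessage()
    for node in path:
        msg.reports.append((node.unallocated_cells, node.buffered_cells))
    return msg

def choose_block(msg, requested_cells):
    """Assumed decision rule at the first node: limit by the tightest node."""
    tightest = min(free for free, _buffered in msg.reports)
    return min(requested_cells, tightest)

path = [Node(40, 3), Node(10, 7), Node(25, 0)]
returned = collect(path)
block = choose_block(returned, requested_cells=16)   # limited to 10 by the second node
```

In this sketch the second control message would then carry the chosen
block size to every node on the path.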

Brief Description of the Drawing

In the drawing,

Fig. 1 discloses the architecture of a typical data switching network
having a plurality of switching nodes connected to user packet host
sources and destinations;

Fig. 2 discloses illustrative details of a data receiving and queueing
arrangement at a node for an incoming channel having a plurality of
multiplexed time slots corresponding to individual virtual circuits;

Fig. 3 discloses illustrative details of a controller of Fig. 2 that
administers the buffer space allocation and data queueing of virtual
circuits on an incoming channel;

Fig. 4 discloses illustrative details of a router that converts
between variable-length data packets from a host and constant-length
data cells and further administers the buffer space allocation and
data queueing at the router;

Fig. 5 shows an illustrative method of determining buffer lengths of
data for a virtual circuit at a router or switching node;

Figs. 6 and 7 show illustrative flowcharts depicting the protocols and
method steps performed at the routers and nodes for dynamically
allocating buffer space for a virtual circuit at routers and nodes for
an embodiment in which buffer lengths at input routers are used as
decision criteria for dynamic buffer allocation; and

Figs. 8 through 12 disclose flowcharts depicting the protocols and
method steps performed at the routers and nodes for allocating buffer
space in blocks of fixed or varying sizes to virtual circuits in an
embodiment in which buffer lengths at nodes are used in conjunction
with buffer lengths at routers as decision criteria.

Detailed Description 

Fig. 1 shows a block diagram of an illustrative packet-switching
network. It is assumed that the network interconnects many packet
sources and destinations by means of virtual circuits among a number
of routers and switching nodes. Packet sources and destinations are
attached to local area networks that are on user sites. For example,
source 102 is connected to a local network 106, which is connected to
a router 110. One of the functions of the router is to convert between
the variable-length data packets issued by the source and the
constant-length data cells transmitted and switched by the cell
network 100. While cells are considered to be of fixed length, this is
not a limitation of the invention. Other functions of the router
relevant to the invention will be described below.

The router attaches the local network 106 to the cell network 100 via
the access line 108. Data cells belonging to a particular virtual
circuit are transmitted through a sequence of switching nodes 114 and
data links 118 to an access line 118 that is connected to a router
120. The router 120 reassembles the data cells into data packets
addressed to a particular destination, and transmits the packets to
the local network 124, from which they are taken by the destination
128.

It is assumed for purposes of disclosure that the network 100 is
similar to the DATAKIT (R) virtual circuit network marketed by AT&T,
except that the network 100 operates at a considerably higher
transmission rate. That is, it is assumed that network 100 establishes
a virtual circuit path between a source router and a destination
router via selected ones of the switching nodes 114 when a connection
is first initiated. Packets passing from a source to a destination are
routed via the virtual circuit for the duration of the connection,
although the actual transmission lines and bandwidth on the
transmission lines in the path are not dedicated to the connection in
question, but might be time-shared among many such connections.

In accordance with the invention, Fig. 2 shows an illustrative
embodiment of a cell buffering arrangement at a node. This buffering
arrangement is able to handle many virtual circuits. Buffer space is
allocated per virtual circuit, and the allocation for a virtual
circuit can be changed dynamically, under control of the monitor 200.
The monitor is a conventional microprocessor system that is used to
implement congestion control mechanisms to be described later. The
receiver 202 and transmitter 204 in the figure are conventional, and
the transmitter may implement round robin service among the virtual
circuits using established techniques.

When a cell arrives, the receiver 202 determines whether the cell is a
congestion message as indicated by a bit in the header. Congestion
messages are stored in a separate FIFO queue 206 for the monitor. If
an arriving cell is not a congestion message, the receiver 202
produces a virtual circuit number on bus WVC and a write request on
lead WREQ. The receiver places the cell on its output bus 208, where
it is buffered in the cell queue 210 under the control of the
controller 212. The cell queue 210 is a memory array of some suitable
size, which for the purposes of exposition is organised in words which
are one cell wide.

The receiver 202 and the transmitter 204 are autonomous circuits. Each
operates independently of the other to enter cells to and remove cells
from the cell queue 210, respectively. When the transmitter 204 is
ready to send a cell, it produces a virtual circuit number on bus RVC
and a read request on lead RREQ. If the allocated buffer in queue 210
associated with virtual circuit RVC is empty, the controller 212 will
indicate this condition by setting signal EMPTY to a value of TRUE and
the transmitter can try another virtual circuit. Otherwise, the next
cell in the buffer associated with RVC will appear on the output bus
to be read by the transmitter 204. The controller 212 controls the
cell queue via signals on bus MADDR and leads MW and MR. MADDR is the
address in the cell queue 210 at which the next cell is to be written
or read. MW and MR signify a queue write or read operation,
respectively. Congestion messages generated by the monitor 200 are
stored in a separate outgoing FIFO 214. These messages are multiplexed
with outgoing cells onto the transmission line 216 by the transmitter.

To implement congestion control schemes, the monitor 200 has access to
data structures internal to the controller 212 over the buses ADDR, R,
W, and DATA. These data structures include the instantaneous buffer
length for each virtual circuit and the overall number of cells in the
cell queue. Averaging operations required to implement congestion
control, according to the protocols described below, are performed by
the monitor 200.

Fig. 3 shows illustrative details of the controller 212 of Fig. 2. The
major functions of the controller are to keep track of the buffer
allocation for each virtual circuit, to keep track of the
instantaneous buffer use (buffer length) for each virtual circuit, to
manage the allocation of memory in the cell queue such that data can
be buffered for each virtual circuit in a dedicated buffer of
dynamically varying length, and to control the writing and reading of
data in the cell queue as it is received and transmitted. For the
purposes of exposition, memory is partitioned in the queue into units
of one cell. This section first describes the basic elements of the
controller, and then describes the operations of these elements in
detail.

An arbiter 300 receives signals WREQ and RREQ, which are requests to
write a cell to a buffer associated with a particular virtual circuit
or to read a cell from the buffer associated with a particular virtual
circuit, respectively. The arbiter ensures that read and write
operations occur in a non-interfering manner, and that the select
input to the multiplexer (W_OR_R) is set such that input RVC is
present on bus VC during read operations and input WVC is present on
bus VC during write operations. The remainder of this discussion will
consider read and write operations separately.

A table COUNT_TABLE 304 is provided for storing the buffer allocation
and buffer use for each virtual circuit. The table is addressed with a
virtual circuit number on bus VC from the multiplexer 302. Each
virtual circuit has two entries in COUNT_TABLE. One entry, LIMIT[VC],
contains the maximum number of cells of data that virtual circuit VC
is presently allowed to buffer. This, in turn, determines the window
size allocated to the virtual circuit. The second entry, COUNT[VC],
contains the number of cells that are presently used in the cell queue
210 by virtual circuit VC. The contents of COUNT_TABLE can be read or
written by the monitor 200 at any time before or during the operation
of the controller 212.

A table QUEUE_POINTERS 306 contains the read and write pointers for
the buffer associated with each virtual circuit. Read pointer RP[VC]
references the location containing the next cell to be read from the
buffer associated with virtual circuit VC; write pointer WP[VC]
references the next location to be written in the buffer associated
with virtual circuit VC.

Buffers of dynamically varying length are maintained by keeping a
linked list of cells for each virtual circuit. The linked lists are
maintained by the LIST_MANAGER 308, which also maintains a linked list
of unused cells that make up the free buffer space. Operation of the
LIST_MANAGER is described below.

A GLOBAL_COUNT register 310 keeps track of the total number of cells
in all virtual circuit buffers. If each virtual circuit is initialized
with one (unused) cell in its buffer, the initial value of the
GLOBAL_COUNT register is equal to the number of virtual circuits. The
GLOBAL_COUNT register can be written or read by the monitor. The
TIMING + CONTROL circuit 312 supplies all of the control signals
needed to operate the controller.

Prior to the start of read request or write request operations, the
controller is initialized by the monitor. For each virtual circuit,
WP[VC] and RP[VC] are initialized with a unique cell number and
COUNT[VC] is initialized with a value of 1, representing an empty
buffer with one (unused) cell present for receipt of incoming data.
The initial value of LIMIT[VC] is the initial buffer allocation for
that virtual circuit, which is equivalent to its initial window size.
The LIST_MANAGER is initialized such that the free list is a linked
list containing all cells in the cell queue 210 except those which are
initialized in table QUEUE_POINTERS.

When a cell arrives, the receiver asserts a write request on WREQ and
the virtual circuit number on WVC. Bus VC is used to address
COUNT_TABLE, causing the values in the COUNT[VC] and LIMIT[VC] fields
to be sent to a comparator 314. If the virtual circuit in question has
not consumed all of its allocated space in the cell queue, i.e., if
COUNT[VC] is less than LIMIT[VC] in the table, the comparator will
generate a FALSE value on lead LIMITREACHED. Bus VC is also used to
address the QUEUE_POINTERS table such that WP[VC] is present on bus
MADDR. When LIMITREACHED is FALSE, the timing and control circuit will
generate signal MW, which causes the cell to be written to the cell
queue 210, and will control the LIST_MANAGER to cause a new cell to be
allocated and linked into the buffer associated with VC. In addition,
the buffer use for VC and the overall cell count values will be
updated. To update the buffer use, the present value in COUNT[VC] will
be routed via bus COUNT to an up/down counter, which increments the
present number of cells recorded in COUNT[VC] by one. This new value,
appearing on bus NCOUNT, is present at the input of COUNT_TABLE, and
will be written into the table. The overall cell count is incremented
in a similar manner using register GLOBAL_COUNT 310 and an up/down
counter 316.

If, during a write operation, LIMITREACHED is TRUE, which means that
the virtual circuit in question has consumed all of its allocated
space in the cell queue, the T+C circuit 312 will not generate signals
to write data into the cell queue, to allocate a new cell, or to
increment the value of COUNT[VC] or GLOBAL_COUNT. Accordingly, any VC
exceeding its assigned window size loses the corresponding cells, but
the data for other virtual circuits is not affected.
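
The write-side admission check can be modelled in software as shown
below. This is only a sketch of the counting logic described above:
the hardware comparator, buses and list manager are not modelled, and
the class name and the example limits are assumptions.

```python
class CountTable:
    """Software stand-in for COUNT_TABLE and GLOBAL_COUNT (illustrative only)."""

    def __init__(self, limits):
        self.limit = dict(limits)                 # LIMIT[VC]: allowed cells per circuit
        self.count = {vc: 1 for vc in limits}     # COUNT[VC] starts at 1 (one unused cell)
        self.global_count = len(self.limit)       # one cell per circuit initially

    def write(self, vc):
        """Return True if the arriving cell is stored, False if it is dropped."""
        limit_reached = not (self.count[vc] < self.limit[vc])
        if limit_reached:
            return False                          # no write, no allocation, no increment
        self.count[vc] += 1
        self.global_count += 1
        return True

table = CountTable({1: 3, 2: 2})                   # hypothetical limits for circuits 1 and 2
results = [table.write(1) for _ in range(3)]       # third cell exceeds circuit 1's limit
```

Only the circuit that exceeds its limit loses a cell; the counters of
the other circuit are left unchanged.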

When the transmitter is ready to send a new cell, it asserts a read
request on lead RREQ and the virtual circuit number on bus RVC.
COUNT_TABLE is accessed, causing the value of COUNT[VC] to be sent to a
comparator 318, whose second input is the value zero. If the buffer
associated with VC contains no data, the comparator 318 will generate
a TRUE signal on EMPTY, and the operation will be terminated by the
TIMING + CONTROL circuit 312. If EMPTY is FALSE, the up/down counter
320 will decrement the value of COUNT[VC], and the resulting value
will be written into COUNT_TABLE 304. In this case, the value of
RP[VC] from QUEUE_POINTERS is present on bus MADDR and the MR signal
is generated, reading a cell from the cell queue 210. RP[VC] is also
input to the LIST_MANAGER 308 so that the cell can be deallocated and
returned to the free store. The address of the next cell in the buffer
for VC is present on bus NRP and is written into QUEUE_POINTERS 306.
The overall count of cells buffered, which is stored in GLOBAL_COUNT
310, is decremented.

The LIST_MANAGER 308 maintains a linked list of memory locations which
make up cell buffers for each virtual circuit. It also maintains a
linked list of memory locations which make up the free list. The
LIST_MANAGER 308 contains a link memory, LNKMEM 322, which contains
one word of information for every cell in the cell queue 210. The
width of a word in LNKMEM 322 is the logarithm to base 2 of the number
of cells in the cell queue 210. There is a register, FREE 324, which
contains a pointer to the first entry in the free list.

Consider the buffer for virtual circuit VC. The read pointer RP[VC]
points to a location in the cell buffer at which the next cell for
virtual circuit VC is to be read by the transmitter. RP[VC] points to
a location in LNKMEM 322 which contains a pointer to the next cell to
be read from the cell queue 210, and so on. Proceeding in this manner,
one arrives at a location in LNKMEM 322 which points to the same
location pointed to by WP[VC]. In the cell queue 210 this location is
an unused location which is available for the next cell to arrive for
VC.

Free space in the cell queue 210 is tracked in LNKMEM 322 by means of
a free list. The beginning of the free list is maintained in a
register FREE 324, which points to a location in the cell queue 210
which is not on the buffer for any virtual circuit. FREE points in
LNKMEM 322 to a location which contains a pointer to the next free
cell, and so on.

When a write request occurs for a virtual circuit VC, if VC has not
exceeded its buffer allocation, a new cell will be allocated and
linked into the buffer associated with VC. The value in WP[VC] is
input to the LIST_MANAGER 308 on bus WP at the beginning of the
operation. A new value NWP of the write pointer is output by the
LIST_MANAGER 308 at the end of the operation. NWP will be written into
table QUEUE_POINTERS 306. This occurs as follows:

1) The value in register FREE 324, which represents an unused cell,
will be chained into the linked list associated with VC, and will also
be output as NWP.

    NWP = LNKMEM[WP] = FREE

2) The next free location in the free list will be written into FREE
324.

    FREE = LNKMEM[FREE]

When a read request occurs for a virtual circuit VC, the cell which is
currently being read, namely RP[VC], will be input to the LIST_MANAGER
308 on bus RP to be returned to the free list, and the next cell in
the buffer associated with VC will be returned as NRP. NRP will be
written into table QUEUE_POINTERS 306. This occurs as follows:

1) A new read pointer is returned which points to the next cell in the
buffer associated with VC.

    NRP = LNKMEM[RP]

2) The cell which was read in this cycle is deallocated by linking it
into the free list.

    LNKMEM[RP] = FREE
    FREE = RP
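
The pointer updates listed above can be exercised with a small
software model. The sketch below follows the four assignments given
in the text; everything else is an assumption made for illustration,
including the class and attribute names, the use of RP equal to WP as
the empty test, and the free-cell counter that stands in for the
limit check performed elsewhere in the controller.

```python
class ListManager:
    """Software model of LNKMEM, FREE and the per-circuit RP/WP pointers."""

    def __init__(self, n_cells, circuits):
        self.lnkmem = [0] * n_cells
        # Each circuit starts with one unique, unused cell (RP == WP).
        self.rp = {vc: i for i, vc in enumerate(circuits)}
        self.wp = dict(self.rp)
        # All remaining cells are chained into the free list.
        first_free = len(circuits)
        for cell in range(first_free, n_cells - 1):
            self.lnkmem[cell] = cell + 1
        self.free = first_free
        self.free_count = n_cells - first_free

    def write(self, vc):
        """Store a cell at WP[VC] and link a fresh cell behind it."""
        if self.free_count == 0:
            raise MemoryError("no free cells")
        stored_at = self.wp[vc]
        nwp = self.free
        self.lnkmem[stored_at] = nwp           # NWP = LNKMEM[WP] = FREE
        self.free = self.lnkmem[self.free]     # FREE = LNKMEM[FREE]
        self.wp[vc] = nwp
        self.free_count -= 1
        return stored_at

    def read(self, vc):
        """Return the cell at RP[VC] and put it back on the free list."""
        if self.rp[vc] == self.wp[vc]:
            return None                        # buffer holds no data
        cell = self.rp[vc]
        nrp = self.lnkmem[cell]                # NRP = LNKMEM[RP]
        self.lnkmem[cell] = self.free          # LNKMEM[RP] = FREE
        self.free = cell                       # FREE = RP
        self.rp[vc] = nrp
        self.free_count += 1
        return cell

mgr = ListManager(6, ["a", "b"])
writes = [mgr.write("a"), mgr.write("a"), mgr.write("b")]
reads = [mgr.read("a"), mgr.read("a"), mgr.read("a"), mgr.read("b")]
```

Cells for each circuit are read back in the order in which they were
written, and every cell that is read is returned to the free list.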

Fig. 4 is an illustrative embodiment of a router, such as 110 of Fig.
1. Variable-length packets arriving from the local area network 106 of
Fig. 1 are received by the LAN receiver 400 at the upper left of Fig.
4. A global address, present in each packet, is translated to a
virtual circuit number by the translation circuit 402. Since the
packet will be transported using fixed-length cells that may be
smaller or larger than the length of the particular packet under
consideration, additional header or trailer bytes may need to be added
to the packet to facilitate reassembly of the packet from a sequence
of cells which arrive at the destination router, to allow a
destination router to exert flow control over a source router, or to
allow dropped or misdirected cells to be detected. The resulting
information must be padded to a length which is an integral multiple
of the cell size. These functions are not pertinent to the invention;
however, an illustrative embodiment is described to indicate the
relationship of these functions to the congestion management functions
that must be performed by the router.

The LAN packet and the virtual circuit number produced by the translation circuit 402 are passed to segmentation circuit 404, which may add header or trailer bytes to the packet either for the functions described above or as placeholders for such bytes to be supplied by a second segmentation circuit 408. The resulting information is padded to an integral multiple of the cell size and is stored in a cell queue 406, which may be identical in structure to the cell queue 210 described in Fig. 2. In particular, internal data structures in a controller 410 may be accessed by monitor 412 that allow the buffer use (buffer length) to be monitored for each virtual circuit and that allow the buffer allocation per virtual circuit to be adjusted dynamically. Segmentation circuit 408 performs window flow control on each virtual circuit, where the window size for each virtual circuit may be varied dynamically under the control of the protocols described below. To perform window flow control, segmentation circuit 408 may fill in the added data bytes as appropriate to complete the reassembly and flow control protocol. As a minimum, segmentation circuit 408 maintains a counter per virtual circuit which keeps track of the amount of outstanding, unacknowledged data that it has sent in order to implement window flow control, and it receives acknowledgments from the remote receiver indicating data that has passed safely out of the flow control window. Techniques for implementing reassembly and window flow control are well known in the art; the unique aspect of the invention is that the window sizes and buffer sizes may change dynamically under the influence of congestion control messages. The transmitter 415 takes cells from segmentation circuit 408, from the local receiver as described below, and from the outgoing congestion FIFO 419 and sends them out on the outgoing cell transmission line 418.
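
The per-circuit counter of outstanding data can be modeled in a few lines. The sketch below is only an illustration of window flow control with a window that may be changed at run time; the class `WindowFlowControl`, its method names, and the choice of cells as the unit of counting are assumptions, not details taken from segmentation circuit 408.

```python
class WindowFlowControl:
    """Per-virtual-circuit counter of outstanding, unacknowledged cells (illustrative)."""

    def __init__(self, window):
        self.window = window      # current window size, in cells (assumed unit)
        self.outstanding = 0      # cells sent but not yet acknowledged

    def can_send(self, cells=1):
        # Sending is allowed only while the data in flight fits in the window.
        return self.outstanding + cells <= self.window

    def on_send(self, cells=1):
        if not self.can_send(cells):
            raise RuntimeError("window is closed")
        self.outstanding += cells

    def on_ack(self, cells):
        # An acknowledgment moves data out of the flow control window.
        self.outstanding = max(0, self.outstanding - cells)

    def set_window(self, new_window):
        # A congestion control message may change the window dynamically.
        self.window = new_window
```

With a shrinking window, `can_send` simply stays false until enough acknowledgments arrive for the outstanding data to fit into the smaller window.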

Router 110 also receives cells from network 100 via the access line 112 of Fig. 1. These cells arrive at the receiver 414 at the lower right corner of Fig. 4. Insofar as these cells result from packets originated by the source 102 and intended for the destination 128, they will be either congestion messages or acknowledgments from the remote router 120. The handling of cells that may arrive on access line 112 from other sources, which are attempting to communicate with destinations attached to local network 106, will be deferred until the discussion of router 120 below.

When a cell of one of the two types under consideration arrives, the receiver 414 determines whether the cell is a congestion message, as indicated by a bit in the header. Congestion messages are stored in a separate FIFO queue 417 for the monitor 412 and handled according to one of the protocols described below. If the protocol generates a further congestion message, an appropriate cell is sent from the monitor 412 to segmentation circuit 408 and multiplexed onto the outgoing cell transmission line 416. If an arriving cell is not a congestion message, the receiver 414 sends the cell to reassembly circuit 418, which determines whether a cell is an acknowledgment from the remote router. If this is the case, reassembly circuit 418 sends an acknowledgment-received notification to segmentation circuit 408, so that it can update the count of the amount of outstanding data.

A router identical in structure with Fig. 4 may also represent element 120 of Fig. 1. In such case, the receiver 414 corresponding to such router takes cells from the outgoing access line 118 of Fig. 1. Insofar as these cells result from packets originated by the source 102 and intended for the destination 128, they will be either data cells or congestion messages from the remote router 110. When a cell arrives, the receiver 414 determines whether the cell is a congestion message, as indicated by a bit in the header. Congestion messages are stored in a separate FIFO queue 417 for the monitor 412 and handled according to one of the protocols described below. If the protocol generates a further congestion message, an appropriate cell is sent from the monitor 412 to segmentation circuit 408 and multiplexed onto the outgoing cell transmission line 418 at the lower left of Fig. 4. If an arriving cell is not a congestion message, the receiver 414 sends the cell to reassembly circuit 418, which buffers the arriving cell in a per-virtual-circuit buffer in cell queue 420. If the reassembly circuit 418 detects that a complete local area network packet has been accumulated, reassembly circuit 418 sends a send-acknowledgment command to the local transmitter 416 on lead 422, which causes an acknowledgment message to be sent to the remote router 110. In addition, reassembly circuit 418 issues multiple-read requests to the buffer controller 422, causing the cells which make up the packet to be sent in succession to reassembly circuit 424. To facilitate the reassembly procedure, reassembly circuit 424 may delete any header or trailer bytes which were added when the packet was converted to cells by router 110. The packet is then sent to the translation circuit 426, where the global address is translated into a local area network specific address before the packet is sent onto the local area network 124.



Choice of window sizes 

The operation of the apparatus and protocols described in this invention does not depend on the choice of window sizes. Various practical considerations may determine the window sizes that are used. If there are only two window sizes, the following considerations lead to preferred relationships among the numbers of virtual circuits and the window sizes.

Suppose that the maximum number of virtual circuits that can be simultaneously active at a given node is $N_0$. Suppose further that it is decided to provide some number $N_1$ less than $N_0$ of the virtual circuits with full-size windows $W_0$, while providing the remaining $N_0 - N_1$ virtual circuits with buffers of some smaller size $B_0$ that is adequate for light traffic. If there are $N_1$ simultaneous users each of whom gets an equal fraction of the channel, the fraction of the channel that each gets is $1/N_1$. The maximum fraction of the channel capacity that can be obtained by a user having a window size $B_0$ is $B_0/W_0$. Setting $1/N_1$ equal to the maximum fraction of the trunk that can be had by a user with a small buffer, namely $B_0/W_0$, gives the following relationship among the quantities: $W_0/B_0 = N_1$. The total buffer space $B$ allocated to all the virtual circuits is

$$B = (N_0 - N_1) B_0 + N_1 W_0 = N_0 B_0 - W_0 + W_0^2 / B_0.$$

Minimizing $B$ with respect to $B_0$ leads to

$$B_0 = W_0 / (N_0)^{1/2},$$

$$N_1 = (N_0)^{1/2},$$

$$B = [2 (N_0)^{1/2} - 1] W_0.$$

These equations provide preferred relationships among $B_0$, $N_1$, $N_0$, and $W_0$.

If there are more than two window sizes, various choices are possible. It may be convenient to choose the sizes in geometric progression, for example, increasing by powers of 2. An alternative approach that may be preferred in some instances is to have different sizes correspond to round-trip windows at various standard transmission speeds. Still other choices may be dictated by other circumstances.
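
For the two-size case, the preferred relationships can be checked numerically. The following Python sketch evaluates the closed-form expressions and compares them with the total buffer space computed directly from $B = (N_0 - N_1) B_0 + N_1 W_0$ with $N_1 = W_0/B_0$. The function names and the sample values (100 circuits, a full window of 50 cells) are illustrative assumptions.

```python
import math


def preferred_sizes(n0, w0):
    """Return (B0, N1, B) from the closed-form minimum for N0 circuits and full window W0."""
    b0 = w0 / math.sqrt(n0)
    n1 = math.sqrt(n0)
    b = (2 * math.sqrt(n0) - 1) * w0
    return b0, n1, b


def total_buffer(n0, w0, b0):
    """Total buffer space for a given small buffer size B0, using N1 = W0 / B0."""
    n1 = w0 / b0
    return (n0 - n1) * b0 + n1 * w0


# Hypothetical example: 100 circuits, full window of 50 cells.
b0, n1, b = preferred_sizes(100, 50.0)
```

With these sample values the small buffer is 5 cells, 10 circuits can hold full windows, and the total is 950 cells; choosing $B_0 = 4$ or $B_0 = 6$ instead gives a larger total.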

Buffer Allocation Protocols 

The following discusses protocols by means of which sharable buffer space is allocated and deallocated and by means of which virtual-circuit nodes, routers, and hosts are so alerted. The reader is directed to Figs. 5 through 12 as required.

Each node controller 212 keeps track of the buffer length of each of its virtual circuits via the entry COUNT[VC] in the table COUNT_TABLE that has been described in connection with Fig. 3. Each node controller also keeps track of the size of its free list, which is the difference between the (fixed) number of cells in the cell queue 210 of Fig. 2 and the contents of the register GLOBAL_COUNT 310 described in connection with Fig. 2. All of these quantities are available to be read at any time by the node monitor 200 shown in Fig. 2. In a similar way, each router keeps track of the input buffer length of each of its virtual circuits, in a table that is available to the router monitor 412 shown in Fig. 4. For purposes of disclosure, it will be assumed that each router manages its cell queue 406, shown on the left side of Fig. 4, in a manner similar to the switching nodes, so that quantities analogous to COUNT and GLOBAL_COUNT 310 are available to the router's monitor.

It is not necessary, but it is desirable, for the node controllers and the routers to maintain smoothed averages of buffer lengths. A popular smoothing procedure for the time-varying quantity $q$ is given by the easily implementable recursive algorithm,

$$r_n = (1 - f) q_n + f r_{n-1},$$

where $q_n$ represents the value of $q$ at epoch $n$, $r_{n-1}$ represents the moving average at epoch $n-1$, $r_n$ represents the moving average at epoch $n$, and $f$ is a number between 0 and 1 that may be chosen to control the length of the averaging interval. If observations are made at intervals of $\Delta t$ seconds, the approximate averaging interval is $T_{AV}$ seconds, where

$$T_{AV} = (1 - 1/\log f) \Delta t.$$

Appropriate averaging intervals for network congestion control may be between 10 and 100 round-trip times.
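
The recursive smoothing rule is simple enough to state as code. The sketch below is illustrative only: the function names are hypothetical, and the logarithm in the averaging-interval formula is assumed to be the natural logarithm, which the text does not state explicitly.

```python
import math


def smooth(r_prev, q, f):
    """One step of the moving average: r_n = (1 - f) * q_n + f * r_{n-1}."""
    return (1.0 - f) * q + f * r_prev


def averaging_interval(f, dt):
    """Approximate averaging interval T_AV = (1 - 1/log f) * dt.

    Assumption: log is the natural logarithm; f must lie strictly between 0 and 1.
    """
    return (1.0 - 1.0 / math.log(f)) * dt
```

A value of `f` close to 1 gives a long memory, since `math.log(f)` is then a small negative number and the interval grows accordingly.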

In various embodiments of the present invention, up to four binary quantities are used with each virtual circuit as indicators of network congestion. These quantities are defined as follows.

BIG_INPUT. A repetitive program at a router is executed periodically (Fig. 5, step 500) to update this parameter. It is set equal to 1 (step 508) for a virtual circuit if a buffer in a cell queue such as 406 for that virtual circuit at the router 110 has been occupied during more than a certain fraction of the time in the recent past, and it is set equal to 0 (step 510) if the buffer has not been occupied during more than that fraction of time. For the determination of BIG_INPUT, the quantity $q$ in the moving-average algorithm (step 504) may be taken as 1 or 0 depending on whether or not any data is found in the buffer at the given observation. The quantity $r$ (step 506) is then an estimate of the fraction of time that the buffer has been occupied during the past $T_{AV}$ seconds. A representative but by no means exclusive threshold for $r$ would be 0.5.

SOME_BACKLOG. This quantity is set equal to 1 for a given virtual circuit at a given node 114 or output router 120 if the virtual-circuit buffer at that node or router has been occupied during more than a certain fraction of the time in the recent past, and it is set equal to 0 otherwise. For the determination of SOME_BACKLOG, the quantity $q$ in the moving-average algorithm may be taken as 1 or 0 depending on whether or not any data is found in the virtual-circuit buffer at the given observation. The quantity $r$ is then an estimate of the fraction of time that the buffer has been occupied during the past $T_{AV}$ seconds. The flow of control for the monitor program that calculates SOME_BACKLOG is entirely similar to Fig. 5. A representative but by no means exclusive threshold for $r$ would be 0.5. The thresholds for BIG_INPUT and for SOME_BACKLOG need not be the same.

BIG_BACKLOG. This quantity is set equal to 1 for a given virtual circuit at a given node or output router if the virtual circuit has a large buffer length at the node or router, and is set equal to 0 otherwise. Since the lengths of buffers at bottleneck nodes vary slowly, smoothing of the buffer length is probably unnecessary. The criterion for a large buffer length may depend on the set of window sizes. If the window sizes are related by factors of 2, a representative although not exclusive choice would be to set BIG_BACKLOG equal to 1 if the instantaneous buffer length exceeds 75% of the current window, and equal to 0 otherwise. If the window sizes are equally spaced, a representative choice would be to set BIG_BACKLOG equal to 1 if the instantaneous buffer length exceeds 150% of the spacing between windows, and equal to 0 otherwise.

SPACE_CRUNCH. This quantity is set equal to 1 at a given node or output router if the instantaneous number of occupied cells, namely GLOBAL_COUNT 310, at that node or router is greater than some fraction F of the total number of cells in the cell queue 210 or 406 of Fig. 2 or Fig. 4, respectively, and it is set equal to 0 otherwise. A representative choice would be F = 7/8, although the value of F does not appear to be critical.
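
The four indicators reduce to threshold tests, which the following Python sketch illustrates using the representative thresholds quoted above (0.5, 75% of the current window, and F = 7/8). The function names and signatures are assumptions made for illustration; BIG_BACKLOG is shown only for the case of window sizes related by factors of 2.

```python
def occupancy_indicator(r, threshold=0.5):
    """BIG_INPUT or SOME_BACKLOG: 1 if the smoothed occupancy fraction r exceeds the threshold."""
    return 1 if r > threshold else 0


def big_backlog(buffer_len, current_window, fraction=0.75):
    """BIG_BACKLOG for window sizes related by factors of 2 (instantaneous length, unsmoothed)."""
    return 1 if buffer_len > fraction * current_window else 0


def space_crunch(global_count, total_cells, f=7 / 8):
    """SPACE_CRUNCH: 1 if occupied cells exceed the fraction F of the cell queue."""
    return 1 if global_count > f * total_cells else 0
```

The same `occupancy_indicator` serves both BIG_INPUT and SOME_BACKLOG because the two differ only in where the buffer is observed and, optionally, in the threshold.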

Various window management protocols may be embodied using some or all of the congestion indicators defined above. Without limiting the scope of the invention, two embodiments are described below. In each of the two embodiments, each virtual circuit always has a buffer allocation at least as large as the minimum size $B_0$, and it may have other sizes variable up to the limit of a full-size window $W_0$. The first embodiment makes use only of the length of a buffer at a data source (a router) and the availability of free queue space at the nodes to manage changes in window size. The second embodiment makes coordinated use of conditions relating to buffer lengths and free queue space at the data source and at all the nodes of the virtual circuit.

In each of the two embodiments, it is assumed that both directions of the virtual circuit traverse exactly the same nodes, and that each node has a single monitor 200 that can read and respond to messages carried by congestion control cells traveling in either direction. If the forward and return paths are logically disjoint, obvious modifications of the protocols can be used. Instead of carrying out some functions on the return trip of a control message, one can make another traverse of the virtual circuit so that all changes are effected by control messages traveling in the forward direction.

In the first embodiment, the flow of control in the program that runs in the monitor of the input router 110 is shown schematically in Fig. 6. In Fig. 6, the quantity LIMIT refers to the existing buffer allocation for a particular virtual circuit. The quantity WINDOW_SIZE refers to a proposed new buffer allocation. The input router 110 monitors the quantity BIG_INPUT for each of its virtual circuits (step 602 of Fig. 6). From time to time, as will be described below, it may request a change in the size of the window assigned to a given virtual circuit. It makes such a request by transmitting a control message over the virtual circuit (steps 608 and 614). In the embodiment described here, the message is carried by a special congestion control cell that is identified by a bit in its header. Alternatively, the congestion control message may be carried by special bits in a congestion field in the header of an ordinary data cell, if such a field has been provided. There is no logical difference between the use of special control cells and the use of header fields.

An input router that wishes to change the size of its window transmits a message containing the quantities 0, WINDOW_SIZE. The initial 0 represents a variable called ORIGIN. Messages that carry requests from input routers are distinguished by the value ORIGIN = 0; messages that carry responses from output routers have ORIGIN = 1, as will appear below. WINDOW_SIZE is the size of the requested window, coded into as many bits as are necessary to represent the total number of available window sizes. By way of example, if there are only two possible sizes, WINDOW_SIZE requires only a single 0 or 1 bit.

An input router that requests a new window size larger than its present window size (steps 612, 614) does not begin to use the new window size until it has received confirmation at step 618 (as described below). On the other hand, a router does not request a window size smaller than its current allocation until it has already begun to use the smaller window (step 606). Since switch nodes can always reduce buffer allocations that are above the initial window size, confirmation of a request for a smaller window is assured.

When the node controller 212 of a switching node along the forward path receives a control message containing 0, WINDOW_SIZE, it processes the message as follows. If the node controller can make the requested buffer allocation, it does so, and passes the message to the next node without change. If there is insufficient unallocated space in the free list to meet the request, the node allocates as large a buffer size as it can, the minimum being the current buffer size. In either case, the controller writes the value of WINDOW_SIZE that it can allow into the message before passing it along to the next node. The output router also meets the requested value of WINDOW_SIZE as nearly as it can, sets ORIGIN = 1 to indicate a response message, and transmits the response containing the final value of WINDOW_SIZE to the first switching node on the return path. Node controllers on the return path read ORIGIN = 1 and the WINDOW_SIZE field and adjust their allocations accordingly. The adjustments involve, at most, downward allocations for nodes that met the original request before some node failed to do so. When the input router receives a control message containing 1, WINDOW_SIZE, it knows that a set of buffer allocations consistent with the value WINDOW_SIZE exists along the whole path.

A newly opened virtual circuit has a buffer allocation $B_0$ at each node and has a window of size $B_0$. The input router should request an increase in window size as soon as it observes that BIG_INPUT = 1. After requesting a window change and receiving a response, the input router may wait for some period of time D, such as 10 to 100 round-trip times, before inspecting BIG_INPUT again. Then if BIG_INPUT = 1, it may ask for another increase in window size, or if BIG_INPUT = 0, it may ask for a decrease. If a decrease is called for, the input router does not issue the request until the amount of outstanding data on the virtual circuit will fit into the smaller window, and from that time on it observes the new window restriction. The actual allocation is not changed until the value of LIMIT is set equal to the value of WINDOW_SIZE (steps 608, 618).

The flow of control in the program that runs in the monitor of a switching node 114, in response to the arrival of a congestion control cell from either direction, is depicted in Fig. 7. Step 700 changes LIMIT to match the requested window size as closely as possible. Step 702 writes the new value of LIMIT into the control cell and passes the message along to the next node in the virtual circuit.
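
The request and response passes of the first embodiment can be simulated with a short routine. This is a sketch under stated assumptions: each node (the output router is simply the last entry) is modeled as a dictionary with hypothetical keys `limit` (its current allocation, the LIMIT of Fig. 7) and `free` (its unallocated cells), and window sizes are treated as plain integers rather than a coded set of sizes.

```python
def negotiate_window(requested, nodes):
    """Simulate one WINDOW_SIZE request along a path; returns the confirmed size.

    nodes: list of dicts with keys 'limit' and 'free' (illustrative model).
    The list is updated in place.
    """
    window = requested
    # Forward pass (ORIGIN = 0): each node matches the request as closely as it can.
    for node in nodes:
        if window > node["limit"]:
            # Grant as much as the free list allows, never less than the current size.
            grant = min(window, node["limit"] + node["free"])
        else:
            grant = window  # a reduction can always be honored
        node["free"] -= grant - node["limit"]
        node["limit"] = grant
        window = grant      # write the allowed value into the message
    # Return pass (ORIGIN = 1): nodes that granted too much adjust downward.
    for node in reversed(nodes):
        if node["limit"] > window:
            node["free"] += node["limit"] - window
            node["limit"] = window
    return window
```

For example, a request for 16 cells over three nodes holding 4 cells each, with 12, 4, and 20 free cells respectively, is cut to 8 at the second node; the first node then releases its excess on the return pass, leaving every node at 8.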

The previous embodiment has made use only of congestion information at the input router. A second embodiment employs a protocol that coordinates congestion information across the entire circuit in order to pick a new window size if one is needed. It uses a two-phase signaling procedure, in which the first phase sets up the new window and the second phase resolves any discrepancies that may exist among the new window and the buffer allocations at the various nodes. The logical steps carried out by the input and output routers and by the switching nodes are illustrated schematically in Figs. 8 through 12.

The protocol for the second embodiment uses the quantities ORIGIN, BIG_INPUT, SOME_BACKLOG, BIG_BACKLOG, and SPACE_CRUNCH that were defined earlier. Since the protocol uses two phases of signaling, it requires one extra binary quantity, PHASE, which takes the value 0 for Phase 1 and 1 for Phase 2. In Phase 1, the input router 110 initiates a control message carrying a 6-bit field that consists of the quantities ORIGIN = 0, PHASE = 0, BIG_INPUT, SPACE_CRUNCH = 0, SOME_BACKLOG = 0, BIG_BACKLOG = 0. The flow of control for the input router is depicted in Fig. 8.

The flow of control for a node controller is shown in Fig. 9. When a node controller receives a Phase 1 control message, it inspects the values of SPACE_CRUNCH (step 900), SOME_BACKLOG (step 904), and BIG_BACKLOG (step 910), and if its own value of the given quantity is 0, it passes that field unchanged. If its own value of the quantity is 1, it writes 1 into the corresponding field, as shown in Fig. 9 (steps 902, 906, 910), before transmitting the control message to the next switching node (step 912).
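
The node behavior just described amounts to a bitwise OR of each node's own indicators into the message. The sketch below models the 6-bit Phase 1 field as a Python dataclass; the class name `Phase1Message` and the function `node_update` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Phase1Message:
    """The six one-bit quantities carried in a Phase 1 control message."""
    origin: int = 0
    phase: int = 0
    big_input: int = 0
    space_crunch: int = 0
    some_backlog: int = 0
    big_backlog: int = 0


def node_update(msg, space_crunch, some_backlog, big_backlog):
    """Combine a node's own indicators with the message: a 1 is written, a 0 leaves the field alone."""
    msg.space_crunch |= space_crunch
    msg.some_backlog |= some_backlog
    msg.big_backlog |= big_backlog
    return msg
```

Because a field can only change from 0 to 1, the message that reaches the receiving router reports whether any node on the path observed each condition.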

When the receiving router 120 receives a Phase 1 control message, it first combines its own values of SPACE_CRUNCH, SOME_BACKLOG, and BIG_BACKLOG with the values in the arriving message, just as the switching nodes have done. The receiving router then inspects the last four bits of the modified message and calculates a proposed value of WINDOW_SIZE according to the four cases below, using the logic flow shown in Fig. 10.

1) If BIG_INPUT = 1 and SOME_BACKLOG = 0 (step 1000), then increase the window size.

The virtual circuit is nowhere bottlenecked by the round robin scheduler and the virtual circuit would like to send at a faster rate; it is being unnecessarily throttled by its window.

2) If BIG_BACKLOG = 1 and SPACE_CRUNCH = 1 (steps 1002, 1004), then reduce the window size.

Some node is bottlenecked by the round robin scheduler and a big buffer has built up there, so the window is unnecessarily big; and some node is running out of space.

3) If BIG_INPUT = 0 and SOME_BACKLOG = 0 and SPACE_CRUNCH = 1 (step 1006), then reduce the window size.

The virtual circuit has a light offered load, so it does not need a big window to carry the load; and some node is running out of space.

4) In all other cases (step 1008), the present window size is appropriate.
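
The four cases can be written as a single decision function. The sketch below is illustrative: the function name `propose_window`, the ordered list `sizes` of available window sizes, and the choice to move exactly one step up or down that list are assumptions, since the text says only that the window is increased or reduced.

```python
def propose_window(current, sizes, big_input, space_crunch, some_backlog, big_backlog):
    """Return a proposed WINDOW_SIZE from the combined indicator bits.

    sizes: available window sizes in increasing order (assumed); current must be one of them.
    """
    i = sizes.index(current)
    larger = sizes[min(i + 1, len(sizes) - 1)]
    smaller = sizes[max(i - 1, 0)]
    if big_input == 1 and some_backlog == 0:
        return larger       # case 1: throttled only by its window
    if big_backlog == 1 and space_crunch == 1:
        return smaller      # case 2: window too big and space is short
    if big_input == 0 and some_backlog == 0 and space_crunch == 1:
        return smaller      # case 3: light load and space is short
    return current          # case 4: present window is appropriate
```

The cases are tested in the order given in the text, so a circuit that satisfies case 1 is offered a larger window even if SPACE_CRUNCH is set somewhere on the path.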

The receiving router then transmits the Phase 1 control message to the first switching node on the return path (step 1012). The response message contains the fields ORIGIN = 1, PHASE = 0, WINDOW_SIZE, where the last field is a binary encoding of the recommended window size. Each node controller 212 on the return path looks at the control message and takes the action shown in Fig. 11. If an increased allocation is requested (step 1100), the node makes the allocation if it can (step 1102). If it cannot make the requested allocation, it makes whatever allocation it can make, the minimum being the present buffer size, and writes the allocation it has made into the WINDOW_SIZE field (step 1104). The node then transmits the control message to the next node on the return path (step 1106). If the request is for a decreased allocation, the node does not make the decrease yet, but it passes the WINDOW_SIZE field along unchanged.

When the transmitting router receives the Phase 1 response message (step 804), the WINDOW_SIZE field indicates the window that the virtual circuit is going to have. If there is an increase over the present window size, it is available immediately. If there is a decrease, the transmitting router waits for the amount of unacknowledged data in the virtual circuit to drain down to the new window size, as shown in Fig. 8 at step 806. Then it transmits a Phase 2 control message with the fields ORIGIN = 0, PHASE = 1, WINDOW_SIZE (step 810). Node controllers receiving this message take the action shown in Fig. 12. They adjust their buffer allocations downward, if necessary, to the value of WINDOW_SIZE (step 1200), and pass the control message along unchanged (step 1202). The receiving router returns a Phase 2 response message with the fields ORIGIN = 1, PHASE = 1, WINDOW_SIZE. The switching nodes simply pass this message along, since its only purpose is to notify the transmitting router that a consistent set of buffer allocations exists along the entire virtual circuit.
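
The division of labor between the Phase 1 return pass and the Phase 2 forward pass can be illustrated as follows. This is a sketch under assumptions: nodes are listed from the input side to the output side, `limits` and `free` are hypothetical per-node lists of current allocations and unallocated cells, and window sizes are plain integers.

```python
def phase1_return(proposed, limits, free):
    """Phase 1 response (ORIGIN = 1, PHASE = 0), visiting nodes from output side to input side.

    Increases are granted as far as each node's free space allows;
    decreases are deferred to Phase 2 and the field passes unchanged.
    """
    window = proposed
    for i in reversed(range(len(limits))):
        if window > limits[i]:
            grant = min(window, limits[i] + free[i])
            free[i] -= grant - limits[i]
            limits[i] = grant
            window = grant  # write the allocation actually made into WINDOW_SIZE
    return window


def phase2_forward(window, limits, free):
    """Phase 2 request (ORIGIN = 0, PHASE = 1): adjust allocations downward where necessary."""
    for i in range(len(limits)):
        if limits[i] > window:
            free[i] += limits[i] - window
            limits[i] = window
    return window
```

In this model a node near the output side may briefly hold more than the final window after Phase 1; Phase 2 removes that discrepancy, which matches the stated purpose of the second phase.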

After completing Phase 2, the transmitting router waits for a while, as shown at step 816 in Fig. 8, before beginning Phase 1 again. First it waits until either a window's worth of data has been transmitted since the end of Phase 2 or a certain period of time D, such as 10 to 100 round-trip times, has elapsed since the end of Phase 2, whichever comes first. Then, if the present window size is greater than the minimum window size $B_0$ (step 818) or if BIG_INPUT = 1 (step 800), Phase 1 begins immediately; otherwise, Phase 1 begins as soon as BIG_INPUT = 1. A newly opened virtual circuit, whose initial window size and buffer allocations are $B_0$, should begin Phase 1 as soon as BIG_INPUT = 1, if ever.



Claims 

1. A method of controlling the congestion of data cells in a network having a plurality of switching nodes and a plurality of incoming virtual circuits at each node, said method comprising the steps of assigning an initial cell buffer to each virtual circuit at each node,

storing incoming cells for virtual circuits in their respective buffers and removing cells from the buffers for forward routing, characterized by

dynamically allocating buffer space selectively to ones of the incoming circuits at a node in response to signals requesting increased or decreased data window sizes, respectively.

2. The method of claim 1 wherein the step of assigning an initial cell buffer comprises

assigning an initial buffer of predetermined size to each virtual circuit.

3. The method of claim 2 wherein the predetermined size of the initial buffer is less than the size of a full data window, wherein a full data window is defined as the product of the maximum transmission bit rate of the virtual circuit multiplied by a nominal factor representing round trip propagation time in the network.

4. The method of claim 1 wherein the step of dynamically allocating buffer space to a virtual circuit further comprises allocating a full data window in response to a signal requesting a larger buffer space, wherein a full data window is defined as the product of the maximum transmission bit rate of the virtual circuit multiplied by a nominal factor representing round trip propagation time in the network.

5. The method of claim 4 further comprising the step of requesting a larger buffer space based on the amount of data waiting to be sent for the said virtual circuit at the cell source.

6. The method of claim 4 wherein the step of dynamically allocating a full data window further comprises determining if sufficient free buffer space exists at each node of the virtual circuit to perform the allocation and denying the request otherwise.

7. The method of claim 1 wherein the step of dynamically allocating buffer space further comprises allocating space to a virtual circuit in one or more blocks of fixed size.

8. The method of claim 1 wherein the step of allocating buffer space further comprises allocating space to a virtual circuit in blocks of variable size.

9. The method of claim 7 or claim 8 further comprising the step of determining the size of a block to be allocated at each node of a virtual circuit based on the amount of data waiting to be sent for the said virtual circuit at the cell source.

10. The method of claim 9 wherein the step of dynamically allocating buffer space in response to a request for a larger buffer further comprises determining if sufficient free buffer space exists at each node of the virtual circuit to perform the allocation and denying the request otherwise.

11. The method of claim 9 further comprising the step of determining the size of a block to be allocated based on the amount of packet data already buffered for the said virtual circuit at each said node.

12. The method of claim 11 further comprising the step of determining the size of a block to be allocated at each node of a virtual circuit based on the amount of free buffer space at each said node.

13. The method of claim 11 wherein the step of dynamically allocating buffer space in response to a request for a larger buffer further comprises determining if sufficient free buffer space exists at each node of the virtual circuit to perform the allocation and denying the request otherwise.

14. The method of claim 13 wherein the step of determining if sufficient free buffer space exists at each node further comprises

transmitting a control message along a virtual circuit from the first node in the circuit to the last node in the circuit,

writing information into the control message as it passes through each node describing the amount of free buffer space that can be allocated at the node, and

selecting the amount of buffer space assigned to the virtual circuit at each node to be equal to the smallest amount available at any node of the virtual circuit based on the final results in the control message.

15. The method of claim 13 wherein the step of determining if sufficient free buffer space exists at each node further comprises

transmitting a control message along a virtual circuit from the first node in the circuit to the last node in the circuit, the control message containing information representing whether a large or small amount of data is buffered at the initial node for the virtual circuit and information representing the availability of free buffer space at the initial node,

overwriting said information in the control message with new information as it passes through each node if the new information at a node is more restrictive, and

selecting the amount of buffer space assigned to the virtual circuit at each node based on the final results in the control message.

16. The method of claim 14 wherein the step of determining if sufficient free buffer space exists at each node further comprises

performing the selecting step at the final node, and

returning a second control message from the last node through each node of the virtual circuit, and

adjusting the allocation at each node in response to the second control message.

17. The method of claim 14 wherein the step of determining if sufficient free buffer space exists at each node further comprises

returning the control message from the last node in the virtual circuit to the first node in the virtual circuit,

performing the selecting step at the initial node,

transmitting a second control message from the first node to the last node to perform the allocation.

18. The method of claim 2 wherein the size of the initial cell buffer is equal to the size of a full data window divided by the square root of the maximum number of virtual circuits that can simultaneously exist in any node.

19l The method of claim 1 or claim 2 or claim 3 or 
claim 4 or claim 7 or claim 8 further compris- 



ing the step of discarding data for a virtual 
circuit during buffer overflow for the said virtual 
circuit 

2a The method of claim 1 or claim 2 or claim 3 or 
claim 4 or claim 7 or claim 8 further compris- 
ing the step of requesting a reduction in the 
allocated buffer space for the virtual circuit 
after a prior increase of the buffer space above 
the initial buffer space based on the amount of 
data waiting to be sent for the said virtual 
circuit at the cell source. 

21. The method of claim 20 wherein the step of 
requesting a reduction in the allocated buffer 
space for the virtual circuit after a prior increase
of the buffer space above the initial
buffer space is further based on the amount of
data already buffered for the said virtual circuit
at each said node.

22. The method of claim 21 wherein the step of 
requesting a reduction in the allocated buffer 
space for the virtual circuit after a prior increase
of the buffer space above the initial
buffer space is further based on the amount of 
free buffer space at each said node. 

23. The method of claim 1 wherein the step of 
dynamically allocating buffer space further 
comprises 

transmitting a control message along a vir- 
tual circuit from the first node in the circuit to 
the last node in the circuit 

writing information into the control message
as it passes through each node describing
the amount of free buffer space that can be
allocated at the node, and

selecting the amount of buffer space as- 
signed to the virtual circuit at each node to be 
equal to the smallest amount available at any 
node of the virtual circuit based on the final 
results in the control message. 
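
The selection rule of claim 23 amounts to taking the minimum over the path. The sketch below assumes the control message carries a single running value that each node lowers when its own allocatable space is smaller; the claim does not specify the message layout, and the function name smallest_available is hypothetical.

```python
def smallest_available(free_per_node: list[int]) -> int:
    """Carry a running minimum from the first node to the last node."""
    message = free_per_node[0]       # value written by the first node
    for free in free_per_node[1:]:   # each later node in circuit order
        if free < message:
            message = free           # node can offer less, so record that
    return message


# Hypothetical allocatable space, in cells, at three nodes of one circuit.
grant = smallest_available([40, 12, 30])
```

Every node is then assigned the same amount, which is the most that the tightest node on the circuit can supply.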

24. The method of claim 23 wherein the step of 
dynamically allocating buffer space further 
comprises 

performing the selecting step at the final 
node, and 

returning a second control message from
the last node through each node of the virtual
circuit, and

adjusting the allocation at each node in 
response to the second control message. 
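
The two-message exchange of claim 24 might look like the following sketch. It assumes the first message carries a running minimum of allocatable space and that each node adjusts by moving the granted amount from its free pool to the circuit. The Node class and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    free: int           # unallocated buffer space at this node
    allocated: int = 0  # space currently assigned to the virtual circuit


def forward_pass(nodes: list[Node]) -> int:
    """First message: first node to last node, tracking the smallest offer."""
    smallest = nodes[0].free
    for node in nodes[1:]:
        smallest = min(smallest, node.free)
    return smallest  # the selection is made at the final node


def return_pass(nodes: list[Node], grant: int) -> None:
    """Second message: last node back through each node, adjusting allocations."""
    for node in reversed(nodes):
        node.free -= grant
        node.allocated += grant


def allocate(nodes: list[Node]) -> int:
    grant = forward_pass(nodes)
    return_pass(nodes, grant)
    return grant


circuit = [Node(free=40), Node(free=12), Node(free=30)]
granted = allocate(circuit)
```

Because the second message visits every node, all nodes end up with the same record of what was granted to the circuit.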

25. The method of claim 24 wherein the step of 
determining if sufficient free buffer space ex- 
ists at each node further comprises 

returning the control message from the last
node in the virtual circuit to the first node in
the virtual circuit,

performing the selecting step at the initial
node, and

transmitting a second control message 
from the first node to the last node to perform 
the allocation. 